Randomly Initialized Subnetworks with Iterative Weight Recycling
The Multi-Prize Lottery Ticket Hypothesis posits that randomly initialized
neural networks contain several subnetworks that achieve comparable accuracy to
fully trained models of the same architecture. However, current methods require
that the network be sufficiently overparameterized. In this work, we propose a
modification to two state-of-the-art algorithms (Edge-Popup and Biprop) that
finds high-accuracy subnetworks with no additional storage cost or scaling. The
algorithm, Iterative Weight Recycling, identifies subsets of important weights
within a randomly initialized network for intra-layer reuse. Empirically, we
show improvements on smaller network architectures and at higher prune rates,
finding that model sparsity can be increased through the "recycling" of
existing weights. In addition to Iterative Weight Recycling, we complement the
Multi-Prize Lottery Ticket Hypothesis with a reciprocal finding: high-accuracy,
randomly initialized subnetworks produce diverse masks, despite being
generated with the same hyperparameters and pruning strategy. We explore the
landscapes of these masks, which exhibit high variability.
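The intra-layer recycling idea can be illustrated with a small sketch. The function below is a hypothetical rendering, not the paper's exact algorithm: given a layer's random weights and a per-weight importance score (as in Edge-Popup style methods), the lowest-scoring weights are overwritten with copies of the highest-scoring ones in the same layer, so no new weight values are introduced and storage cost is unchanged.

```python
import numpy as np

def recycle_weights(w, scores, prune_rate=0.5):
    """Hypothetical intra-layer weight recycling sketch: weights whose
    importance scores fall in the bottom `prune_rate` fraction are
    replaced with copies of the top-scoring weights of the same layer,
    reusing existing values rather than allocating new ones."""
    flat_w = w.ravel().copy()
    order = np.argsort(scores.ravel())          # ascending by score
    k = int(prune_rate * flat_w.size)           # low-score slots to recycle
    low, high = order[:k], order[-k:]           # recipients and donors
    flat_w[low] = flat_w[high[np.arange(k) % high.size]]
    return flat_w.reshape(w.shape)

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4))                 # random layer weights
s = rng.random((4, 4))                          # per-weight importance scores
w_recycled = recycle_weights(w, s, prune_rate=0.5)
```

After recycling, the layer contains only values already present at initialization, which is why the method adds no storage overhead.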
Utilizing Network Properties to Detect Erroneous Inputs
Neural networks are vulnerable to a wide range of erroneous inputs such as
adversarial, corrupted, out-of-distribution, and misclassified examples. In
this work, we train a linear SVM classifier to detect these four types of
erroneous data using hidden and softmax feature vectors of pre-trained neural
networks. Our results indicate that these faulty data types generally exhibit
linearly separable activation properties from correct examples, giving us the
ability to reject bad inputs with no extra training or overhead. We
experimentally validate our findings across a diverse range of datasets,
domains, pre-trained models, and adversarial attacks.
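The detection setup can be sketched in a few lines. The snippet below is an illustrative stand-in, not the paper's implementation: a linear SVM (here trained directly via hinge-loss sub-gradient descent to stay self-contained) separates feature vectors of correct inputs (+1) from erroneous ones (-1); in the paper's setting, the features would come from a pre-trained network's hidden and softmax layers, while here they are synthetic.

```python
import numpy as np

def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=200):
    """Minimal linear SVM via sub-gradient descent on the hinge loss.
    X: feature vectors (in the paper: hidden/softmax activations),
    y: +1 for correct inputs, -1 for erroneous ones."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1                      # margin violations
        grad_w, grad_b = lam * w, 0.0           # L2 regularization term
        if viol.any():
            grad_w -= (y[viol, None] * X[viol]).mean(axis=0)
            grad_b -= y[viol].mean()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Synthetic proxy: "correct" and "erroneous" feature clusters.
rng = np.random.default_rng(1)
n, d = 100, 8
offset = np.array([2.0] + [0.0] * (d - 1))
X = np.vstack([rng.standard_normal((n, d)) + offset,
               rng.standard_normal((n, d)) - offset])
y = np.concatenate([np.ones(n), -np.ones(n)])
w, b = train_linear_svm(X, y)
accuracy = (np.sign(X @ w + b) == y).mean()
```

Because the detector is a single linear classifier over activations the network already computes, rejecting bad inputs adds essentially no inference overhead.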
Cross-Silo Federated Learning Across Divergent Domains with Iterative Parameter Alignment
Learning from the collective knowledge of data dispersed across private
sources can provide neural networks with enhanced generalization capabilities.
Federated learning, a method for collaboratively training a machine learning
model across remote clients, achieves this by combining client models via the
orchestration of a central server. However, current approaches face two
critical limitations: i) they struggle to converge when client domains are
sufficiently different, and ii) current aggregation techniques produce an
identical global model for each client. In this work, we address these issues
by reformulating the typical federated learning setup: rather than learning a
single global model, we learn N models each optimized for a common objective.
To achieve this, we apply a weighted distance minimization to model parameters
shared in a peer-to-peer topology. The resulting framework, Iterative Parameter
Alignment, applies naturally to the cross-silo setting, and has the following
properties: (i) a unique solution for each participant, with the option to
globally converge each model in the federation, and (ii) an optional
early-stopping mechanism to elicit fairness among peers in collaborative
learning settings. These characteristics jointly provide a flexible new
framework for iteratively learning from peer models trained on disparate
datasets. We find that the technique achieves competitive results on a variety
of data partitions compared to state-of-the-art approaches. Further, we show
that the method is robust to divergent domains (i.e. disjoint classes across
peers) where existing approaches struggle. Comment: Published at IEEE Big Data 202
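The core update can be sketched as follows. This is a hypothetical simplification, not the paper's exact formulation: each peer keeps its own parameter vector and, in every round, moves it toward the other peers' parameters in proportion to a per-pair weight, i.e. a gradient step on a weighted sum of squared parameter distances. In the full method this pull would be combined with each peer's local task gradient.

```python
import numpy as np

def align_step(params, weights, eta=0.5):
    """One round of weighted distance minimization (illustrative):
    peer i's parameters are pulled toward every peer j's parameters,
    scaled by weights[i, j]. Each peer retains a distinct model."""
    new_params = []
    for i, p in enumerate(params):
        pull = sum(weights[i, j] * (params[j] - p)
                   for j in range(len(params)))
        new_params.append(p + eta * pull)
    return new_params

rng = np.random.default_rng(2)
params = [rng.standard_normal(4) for _ in range(3)]   # 3 peers, toy models
W = np.full((3, 3), 1 / 3)                            # uniform peer weights

for _ in range(20):                                   # iterative alignment
    params = align_step(params, W)

# maximum pairwise distance after alignment
spread = max(np.linalg.norm(params[i] - params[j])
             for i in range(3) for j in range(3))
```

With uniform weights, each round halves every peer's deviation from the group mean, so the models converge globally; non-uniform weights, or stopping the rounds early, would leave each peer with a distinct solution, matching the unique-solution and early-stopping properties described above.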